Asymptotics for sliced average variance estimation
Authors
Abstract
In this paper, we systematically study the consistency of sliced average variance estimation (SAVE). The findings reveal that when the response is continuous, the asymptotic behavior of SAVE is rather different from that of sliced inverse regression (SIR). SIR can achieve √n consistency even when each slice contains only two data points. However, SAVE cannot be √n consistent, and it even turns out to be inconsistent when each slice contains a fixed number of data points that does not depend on n, where n is the sample size. These results theoretically confirm the notion that SAVE is more sensitive to the number of slices than SIR. Taking this into account, a bias correction is recommended so that SAVE can achieve √n consistency. In contrast, when the response is discrete and takes finitely many values, √n consistency can be achieved. Therefore, an approximation through discretization, which is commonly used in practice, is studied. A simulation study is carried out for the purposes of illustration.

1. Introduction. Dimension reduction has become one of the most important issues in regression analysis because of its importance in dealing with high-dimensional data. Let Y and x = (x_1, ..., x_p)^T be the response and the p-dimensional covariate, respectively. In the literature, when Y depends on x through a few linear combinations B^T x, where B = (β_1, ..., β_k), several methods have been proposed for estimating the projection directions B, or the space spanned by B, such as projection pursuit regression (PPR) [11], the alternating conditional expectation (ACE) method [1], principal Hessian directions (pHd) [17], minimum average variance estimation (MAVE) [23], iterated pHd [7] and profile least-squares ...
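To make the slicing schemes concrete, the following is a minimal sketch (not taken from the paper; all function names are our own) of how SIR and SAVE estimate a single direction: both partition the data into H slices by the response, but SIR uses only the slice means of the standardized covariate, while SAVE additionally uses the slice covariances, which is the source of its greater sensitivity to the slicing.

```python
import numpy as np

rng = np.random.default_rng(0)

def _whiten(X):
    """Standardize X so that the sample covariance is (approximately) the identity."""
    mu = X.mean(axis=0)
    L = np.linalg.cholesky(np.cov(X, rowvar=False))
    Z = (X - mu) @ np.linalg.inv(L).T
    return Z, L

def _slices(y, H):
    """Partition observation indices into H roughly equal-count slices by sorted y."""
    return np.array_split(np.argsort(y), H)

def _leading_direction(M, L):
    """Top eigenvector of the kernel matrix, mapped back to the original x-scale."""
    _, vecs = np.linalg.eigh(M)
    beta = np.linalg.inv(L).T @ vecs[:, -1]
    return beta / np.linalg.norm(beta)

def sir(X, y, H=10):
    """SIR kernel: Cov(E[Z | slice]), built from slice means only."""
    Z, L = _whiten(X)
    n, p = Z.shape
    M = np.zeros((p, p))
    for idx in _slices(y, H):
        zbar = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(zbar, zbar)
    return _leading_direction(M, L)

def save(X, y, H=10):
    """SAVE kernel: E[(I - Cov(Z | slice))^2], built from slice covariances."""
    Z, L = _whiten(X)
    n, p = Z.shape
    M = np.zeros((p, p))
    for idx in _slices(y, H):
        D = np.eye(p) - np.cov(Z[idx], rowvar=False)
        M += (len(idx) / n) * (D @ D)
    return _leading_direction(M, L)

# Toy model: y depends on x only through the first coordinate.
n, p = 2000, 5
X = rng.standard_normal((n, p))
y_mono = X[:, 0] + 0.2 * rng.standard_normal(n)       # monotone link: SIR succeeds
y_symm = X[:, 0] ** 2 + 0.2 * rng.standard_normal(n)  # symmetric link: SIR degenerates, SAVE succeeds

b_sir = sir(X, y_mono)
b_save = save(X, y_symm)
print(abs(b_sir[0]), abs(b_save[0]))  # both should be close to 1
```

With many points per slice, as here (200 per slice), both kernels are stable; the paper's point is that shrinking the slice size to a fixed constant as n grows leaves SIR √n consistent but breaks the consistency of SAVE, whose kernel depends on within-slice covariance estimates.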
Similar Articles
Likelihood-based Sufficient Dimension Reduction
We obtain the maximum likelihood estimator of the central subspace under conditional normality of the predictors given the response. Analytically and in simulations we found that our new estimator can perform much better than sliced inverse regression, sliced average variance estimation and directional regression, and that it seems quite robust to deviations from normality.
A note on extension of sliced average variance estimation to multivariate regression
Many sufficient dimension reduction methodologies for univariate regression have been extended to multivariate regression. Sliced average variance estimation (SAVE) has the potential to recover more reductive information, and recent development enables us to test the dimension and predictor effects with distributions comm...
On the distribution of the left singular vectors of a random matrix and its applications
In several dimension reduction techniques, the original variables are replaced by a smaller number of linear combinations. The coefficients of these linear combinations are typically the elements of the left singular vectors of a random matrix. We derive the asymptotic distribution of the left singular vectors of a random matrix that has a normal limit distribution. This result is then used to ...
On model-free conditional coordinate tests for regressions
Existing model-free tests of the conditional coordinate hypothesis in sufficient dimension reduction (Cook (1998) [3]) focused mainly on first-order estimation methods such as sliced inverse regression estimation (Li (1991) [14]). Such testing procedures based on quadratic inference functions are difficult to extend to second-order sufficient dimension reduction methods such as the...
Graphical Methods for Class Prediction Using Dimension Reduction Techniques on DNA Microarray Data
MOTIVATION We introduce simple graphical classification and prediction tools for tumor status using gene-expression profiles. They are based on two dimension estimation techniques: sliced average variance estimation (SAVE) and sliced inverse regression (SIR). Both SAVE and SIR are used to infer the dimension of the classification problem and obtain linear combinations of genes that contain su...